Abstract

A scenery is a coloring $\xi$ of the integers. Let $\{S_t\}_{t \ge 0}$ be a recurrent random walk
on the integers. Observing the scenery $\xi$ along the path of this random walk, one
sees the color $\chi_t := \xi(S_t)$ at time $t$. The scenery reconstruction problem is concerned
with recovering the scenery $\xi$, given only the sequence of observations $\chi := (\chi_t)_{t \ge 0}$.
The scenery reconstruction methods presented to date require the random walk
to have bounded increments. Here, we present a new approach for random walks
with unbounded increments which works when the tail of the increment distribution
decays exponentially fast enough and the scenery has five colors.
1 Introduction

Consider a coloring of the integers $\xi : \mathbb{Z} \to \{1,2,3,4,5\}$, which we shall call a scenery.
Let $S$ be a recurrent random walk starting at the origin. We assume that we observe the
scenery along the path of the random walk $S$, that is, we observe the color $\chi_t := \xi(S_t)$ at
time $t$. The scenery reconstruction problem is concerned with determining the scenery $\xi$
using a single realization of the color record

$$\chi := (\chi_0, \chi_1, \chi_2, \ldots).$$

The scenery reconstruction problem can also be formulated as follows: Does one path
realization of the process $\{\chi_t\}_{t \ge 0}$ uniquely determine $\xi$? The answer in such general terms
is "no". However, under appropriate restrictions, the answer becomes "yes". Firstly,
we can at best hope to be able to reconstruct the scenery up to translation and reflection.
Secondly, there are sceneries which cannot be reconstructed: Lindenstrauss [15]
exhibited such sceneries. Only 'typical' sceneries can be reconstructed.
Sceneries that can be obtained from one another by shift and reflection are
called equivalent and we shall use the symbol $\sim$ to denote their equivalence.

The scenery reconstruction problem arose from questions posed by Kesten, Keane,
Benjamini, den Hollander and others. It also falls into the research area concerned with
the investigation of the ergodic properties of the color record $\chi$. One of the motivations
for studying scenery reconstruction comes from ergodic theory, for example via the
$T, T^{-1}$ problem; see Kalikow [8]. The ergodic properties of the observations $\chi$ were studied
by Heicklen, Hoffman, Rudolph in [I], Kesten and Spitzer in [12], Keane and den
Hollander in [9], den Hollander in [2] and den Hollander and Steif in [I].

A related important problem is the distinguishing of sceneries: Benjamini, den Hollander,
and Keane independently asked whether all non-equivalent sceneries could be
distinguished. We give a brief outline of this problem. Let $\eta_1$ and $\eta_2$ be two given sceneries.
Assume that either $\eta_1$ or $\eta_2$ is observed along a random walk path, but we do not
know which one. Can we tell which of the two sceneries was observed? Kesten and
Benjamini proved that one can distinguish almost every pair of sceneries, even in two
dimensions and with only two colors. Before that, Howard had proved in [5], [6], and [7]
that any two periodic one-dimensional non-equivalent sceneries are distinguishable, and
that one can almost surely distinguish single defects in periodic sceneries. The problem
of distinguishing two sceneries which differ only in one point is called "detecting a single
defect in a scenery". Kesten in [10] proved that one can a.s. recognize a single defect in
a random scenery with at least five colors. He asked whether one can distinguish a single
defect even if there are only two colors in the scenery.

Kesten's question was answered by Matzinger in his Ph.D. thesis [18]. Given that the
colors in the scenery are taken to be i.i.d. uniformly distributed, he showed that almost
every 2-color scenery can be almost surely reconstructed up to equivalence. In [19],
Matzinger proved that almost every 3-color scenery can be almost surely reconstructed.
Kesten [11] noticed that the proofs employed in [18] and [19] rely heavily on the skip-free
property of the random walk as well as the one-dimensionality of the scenery. He asked
whether the result might still hold in more general situations.

In [16], Matzinger and Lowe showed that one can still reconstruct sceneries in two
dimensions, provided there are sufficiently many colors. In [23], Rolles and Matzinger
adapted the method proposed by Lowe, Merkl and Matzinger to the case where random errors
occur in the observed color record. They showed that the scenery can be reconstructed
provided the probability of the errors is small enough. When the observations are seen
with random errors, the reconstruction of sceneries is closely related to some coin tossing
problems. These have been investigated by Harris and Keane [3] and Levin, Pemantle
and Peres [14].

This paper deals with the problem of whether one can reconstruct a scenery seen along 
the path of a random walk having unbounded jumps, a question which was asked by den 
Hollander. Our main result shows that we can a.s. reconstruct a five-color (random) 
scenery seen along the path of a random walk S with unbounded jumps, provided the 
probability of making a jump of non-unit size is not too high and the tail of the increment 
distribution of the random walk decays exponentially fast enough. By unbounded jumps, 
we mean that the support of the random walk's increment distribution is not bounded. 

Methods for carrying out scenery reconstruction differ greatly depending on the distribution
of the scenery and the nature of the random walk. The methods for scenery
reconstruction for a simple random walk ([IE], [IS], [20], [IS], [21], [22], [23]), together
with the methods appropriate when the support of the increment distribution is bounded
([IT] and [IE]), fail in the setting of "unbounded jumps". It was not possible to adapt
existing methods to the present situation. The approach utilized here is fundamentally
different from those which have been fruitfully applied in the setting of random walks
having "bounded jumps".

We begin by explaining the setting considered in this article. Let $c > 0$ be an exponential
decay rate and $\epsilon > 0$ the probability of jumping a distance that is not of unit length.
We shall impose the following conditions on the random walk $S$:

$$P(|S_{t+1} - S_t| \ne 1) = \epsilon \tag{1}$$

and

$$P(|S_{t+1} - S_t| = i \mid |S_{t+1} - S_t| \ne 1) < e^{-ci}. \tag{2}$$

We shall also require that the distribution of the random walk be symmetric:

$$P(S_{t+1} - S_t = i) = P(S_{t+1} - S_t = -i). \tag{3}$$

This condition could be replaced by the weaker condition $E[S_{t+1} - S_t] = 0$, but symmetry
simplifies notation. Finally, to ensure that the random walk is non-periodic, we shall
stipulate that

$$P(S_{t+1} - S_t = 0) > 0. \tag{4}$$
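
To make Conditions (1)-(4) concrete, the following Python sketch builds one increment law that satisfies them. The specific law (a geometric tail for the non-unit jumps), the helper `tail_mass` and all function names are our own illustrative assumptions and are not taken from the paper; many other laws satisfy the same conditions.

```python
import math
import random


def tail_mass(c):
    """Mass put on jumps of size >= 2, given a non-unit jump (an illustrative choice)."""
    q = math.exp(-c)
    return 0.5 * min(1.0, q * q / (1.0 - q))


def increment_pmf(i, eps, c):
    """P(S_{t+1} - S_t = i) for the illustrative increment law."""
    q = math.exp(-c)
    m = tail_mass(c)
    if abs(i) == 1:
        return (1.0 - eps) / 2.0          # unit steps carry total mass 1 - eps, Condition (1)
    if i == 0:
        return eps * (1.0 - m)            # strictly positive holding probability, Condition (4)
    # |i| >= 2: geometric tail, split evenly between both signs, Conditions (2) and (3)
    return eps * m * (1.0 - q) * q ** (abs(i) - 2) / 2.0


def sample_increment(rng, eps, c):
    """Draw one increment from the law described by increment_pmf."""
    if rng.random() >= eps:
        return rng.choice((-1, 1))
    if rng.random() >= tail_mass(c):
        return 0
    u = 1.0 - rng.random()                # uniform on (0, 1]
    size = 2 + int(math.log(u) / -c)      # 2 plus a geometric variable on {0, 1, ...}
    return size * rng.choice((-1, 1))
```

The support of this law is all of $\mathbb{Z}$, so the jumps are unbounded in the sense used here, while large $c$ makes big jumps exponentially rare.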
We can now formulate the main theorem of the paper.

Theorem 1.1 Assume the scenery $\xi : \mathbb{Z} \to \{1,2,3,4,5\}$ is i.i.d. with the five colors
uniformly distributed. If Conditions (1)-(4) are all satisfied, then for every $\epsilon > 0$ small
enough and $c > 0$ large enough, we can a.s. reconstruct the scenery $\xi$. In other words,
there exists a measurable map

$$\mathcal{A} : \{1,2,3,4,5\}^{\mathbb{N}} \to \{1,2,3,4,5\}^{\mathbb{Z}}$$

such that

$$P(\mathcal{A}(\chi) \sim \xi) = 1.$$

The principal difficulty with proving the theorem is in showing that we can reconstruct
the scenery at one point given that we have already managed to reconstruct the
scenery on an interval. The next three sections of the paper are dedicated to obtaining
an estimate $\hat\xi(n+1)$ of $\xi(n+1)$ given the observations $\chi$ and the restriction of the scenery
$\xi|_{[-n,n]} := \xi_{-n}\xi_{-n+1} \cdots \xi_{n-1}\xi_n$. Section 2 uses three simplified problems to introduce key
concepts that shall be required in the sequel. In Section 3 we describe the algorithm for
retrieving $\xi_{n+1}$ from $\chi$ given that we know a portion of the scenery. The following
section then proves that the single-point reconstruction algorithm succeeds, that is,
$\hat\xi_{n+1} = \xi_{n+1}$, with high probability. This is done by showing that the probability of failure
is finitely summable, that is,

$$\sum_n P(\hat\xi(n+1) \ne \xi(n+1)) < \infty. \tag{5}$$

Finally, we conclude by showing in Section 5 that finite summability implies that the
whole scenery $\xi$ can be reconstructed almost surely up to equivalence.

2 Three simplified problems 

To illustrate the key concepts needed for performing scenery reconstruction on 5-color 
sceneries observed by random walks with unbounded jumps, we begin by presenting three 
simplified problems and their solutions. Each solution highlights one key idea which we 
shall use later. 

2.1 Reconstructing one point when the scenery is observed at 
i.i.d. locations 

Take any non-random scenery $\xi : \mathbb{Z} \to \{1,2,3,4,5\}$. Let $a$ and $b$ be two integers with
$a < b$ and define $I := [a, b]$. Let $Y_1, Y_2, \ldots$ be i.i.d. random variables such that

$$P(Y_i = b+1) > P(Y_i \notin [a, b+1]). \tag{6}$$

The first simple problem we consider is to reconstruct the scenery $\xi$ at a single point,
namely $b+1$. So we need to determine the value of $\xi(b+1)$. For this, we suppose we are
given two things: the restriction of $\xi$ to the interval $I$, that is, $\xi|_I := (\xi_i : i \in I)$, and an
infinite sequence of observations of the scenery $\xi$ at the random locations $Y_i$,

$$\xi(Y_1), \xi(Y_2), \xi(Y_3), \ldots \tag{7}$$

Now, for any $e \in \{1,2,3,4,5\}$, we have

$$P(\xi(Y_i) = e) = P(\xi(Y_i) = e, Y_i \in [a,b]) + \mathbb{1}_{\{\xi(b+1) = e\}} P(Y_i = b+1) + P(Y_i \notin [a,b+1], \xi(Y_i) = e).$$

Let $q_e$ be the quantity

$$q_e := P(\xi(Y_i) = e) - P(\xi(Y_i) = e, Y_i \in [a,b]). \tag{8}$$

When inequality (6) holds, two possibilities manifest. If $\xi(b+1) = e$, then $q_e \ge P(Y_i = b+1)$.
On the other hand, if $\xi(b+1) \ne e$, then $q_e \le P(Y_i \notin [a,b+1]) < P(Y_i = b+1)$. We can use
this dichotomy to reconstruct $\xi(b+1)$ as follows:

Estimate $q_e$ and take the element $e \in \{1,2,3,4,5\}$ which maximizes it for the color at
point $b+1$.

To determine $q_e$, we first estimate the probability $P(\xi(Y_i) = e)$ from the sequence
(7). The probability $P(\xi(Y_i) = e, Y_i \in [a,b])$ can be calculated, since we are given the
restriction of $\xi$ to $I$. For this, we also assume that the distribution of the $Y_i$'s is known
to us.
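
The estimate-and-maximize step above can be sketched in Python as follows. The two-sided geometric law used for the $Y_i$'s, the function names and the finite sample standing in for the infinite sequence (7) are illustrative assumptions of ours; with finitely many observations the answer is only correct with high probability.

```python
import math
import random
from collections import Counter

COLORS = (1, 2, 3, 4, 5)


def two_sided_geometric_pmf(y, r):
    """P(Y = y) = (1 - r) / (1 + r) * r**|y|, a symmetric law with decay rate r."""
    return (1.0 - r) / (1.0 + r) * r ** abs(y)


def sample_two_sided_geometric(rng, r):
    """Draw Y as the difference of two i.i.d. geometric variables on {0, 1, ...}."""
    def geometric():
        u = 1.0 - rng.random()                      # uniform on (0, 1]
        return int(math.log(u) / math.log(r))
    return geometric() - geometric()


def reconstruct_point(observed_colors, known, a, b, pmf):
    """Estimate xi(b + 1) from colors seen at i.i.d. locations.

    known maps every y in [a, b] to xi(y); pmf(y) is the known law of the Y_i.
    Returns the maximizing color together with the estimated q_e of (8).
    """
    n = len(observed_colors)
    counts = Counter(observed_colors)
    q = {}
    for e in COLORS:
        inside = sum(pmf(y) for y in range(a, b + 1) if known[y] == e)
        q[e] = counts[e] / n - inside               # empirical version of q_e
    return max(COLORS, key=lambda e: q[e]), q
```

The first term of each `q[e]` is the only place where the observations enter; the second term is computed exactly from the known restriction and the known law.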

Note that Condition (6) holds automatically when the tail of $Y_i$ decays exponentially
with decay rate $r$ strictly less than 1/2 and

$$a < 0 < b \quad \text{and} \quad |a| \gg |b|. \tag{9}$$

To see this, let the $Y_i$'s have a symmetric distribution with

$$P(Y_i = y) \cdot r \ge P(Y_i = y+1), \quad \forall y \in \mathbb{N}, \tag{10}$$

where $r < 0.5$. Assuming that $|a| \gg |b|$, $P(Y_i < a)$ is negligible in comparison to
$P(Y_i = b+1)$. Furthermore, thanks to the exponential decay, we obtain

$$\frac{P(Y_i > b+1)}{P(Y_i = b+1)} \le \frac{r}{1-r} < 1.$$

Finally, if $P(Y_i < a)$ is small enough then this last inequality implies Condition (6).

2.2 Reconstructing a point when the scenery is seen along an 
infinite number of random walks 

Once again assume that $\xi$ is a non-random scenery. As in the previous setting, the
restriction of $\xi$ to $[a, b]$ is known and we try to reconstruct $\xi(b+1)$. We also assume that
condition (9) holds. Let

$$\{S^1_t\}_{t \in \mathbb{N}}, \{S^2_t\}_{t \in \mathbb{N}}, \ldots$$

be a sequence of random walks independent of each other and all of which start at the
origin. Each of the $S^i$'s is identically distributed to the random walk $S_t$, which we define
to have the following increment distribution:

$$P(S_{t+1} - S_t = x) = \begin{cases} \epsilon, & \text{if } x = 0, \\ \gamma, & \text{if } |x| = 1, \\ 0, & \text{otherwise,} \end{cases} \tag{11}$$

where $\gamma = (1-\epsilon)/2$ and $t \in \mathbb{N}$. Thus, each $S^i$ is a simple symmetric random walk modified to
allow sojourns of duration greater than unity in any state. This is the simplest random
walk satisfying Conditions (1)-(4). Let $\chi^i$ denote the observations of the scenery $\xi$ made by
random walk number $i$:

$$\chi^i := (\chi^i_t : t = 0, 1, 2, \ldots),$$

where $\chi^i_t := \xi(S^i_t)$. In addition to knowing the restriction $\xi|_{[a,b]}$, we assume that all the
observations $\chi^1, \chi^2, \chi^3, \ldots$ made by the different random walks are given to us.

Take $q$ such that $1 < q < 3$. In order to apply the reconstruction technique from the
previous section, we set $Y_i := S^i_r$, where $r = qb$ is an integer. The state
distribution of the symmetric random walk $S_t$ is given by



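$$P(S_t = x) = \begin{cases} \sum_{s \in [0,t-x] \,:\, s+t \text{ is even}} \binom{t}{s} \binom{t-s}{(t-s+x)/2} \epsilon^s \gamma^{t-s}, & \text{if } |x| \le t, \\ 0, & \text{otherwise.} \end{cases}$$

Provided $|x| < t$, the decay rate of $S_t$ at the point $x$ is given by

$$\rho(t,x) := \frac{P(S_t = x+1)}{P(S_t = x)} = \frac{\sum_{s \in [0,t-x-1] \,:\, s+t \text{ is even}} \binom{t}{s} \binom{t-s}{(t-s+x+1)/2} \epsilon^s \gamma^{t-s}}{\sum_{s \in [0,t-x] \,:\, s+t \text{ is even}} \binom{t}{s} \binom{t-s}{(t-s+x)/2} \epsilon^s \gamma^{t-s}}$$

$$\le \frac{\sum_{s \in [0,t-x] \,:\, s+t \text{ is even}} \binom{t}{s} \binom{t-s}{(t-s+x)/2} \frac{(t-s-x)/2}{(t-s+x+1)/2} \epsilon^s \gamma^{t-s}}{\sum_{s \in [0,t-x] \,:\, s+t \text{ is even}} \binom{t}{s} \binom{t-s}{(t-s+x)/2} \epsilon^s \gamma^{t-s}} \le \max_{s \in [0,t-x]} \frac{t-s-x}{t-s+x+1} = \frac{t-x}{t+x+1}.$$

Then, with $t = r = qb$ and $x = b$,

$$\rho(r,b) \le f(q,b) := \frac{q-1}{q+1+1/b} < \frac{q-1}{q+1}. \tag{12}$$

For $q \in (1,3)$, the final expression is always less than 1/2 and, since $q$ does not depend
on $b$, the tail of $S_r$ decays at a rate less than 1/2 beyond the point $b$. Hence, condition
(10) is satisfied and we can apply the reconstruction technique described in the previous
subsection.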
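
As a numerical illustration, the state distribution of the walk with increment law (11) can be computed exactly by repeated convolution and Condition (6) can then be checked directly for $Y = S_r$. The parameter values below ($\epsilon = 1/2$, $b = 10$, $q = 2$, $a = -20$) and the function names are hypothetical choices made only for this sketch.

```python
from math import comb


def lazy_walk_distribution(t, eps):
    """Return {x: P(S_t = x)} for the walk with P(0) = eps and P(+1) = P(-1) = (1 - eps) / 2."""
    gamma = (1.0 - eps) / 2.0
    dist = {0: 1.0}
    for _ in range(t):
        nxt = {}
        for x, p in dist.items():
            for step, weight in ((-1, gamma), (0, eps), (1, gamma)):
                nxt[x + step] = nxt.get(x + step, 0.0) + p * weight
        dist = nxt
    return dist


def condition_6_holds(dist, a, b):
    """Check P(Y = b + 1) > P(Y not in [a, b + 1]) for Y with law dist."""
    outside = sum(p for x, p in dist.items() if x < a or x > b + 1)
    return dist.get(b + 1, 0.0) > outside


# Illustrative values: b = 10, q = 2, hence r = q * b = 20.
dist = lazy_walk_distribution(20, 0.5)
# With eps = 1/2 the walk equals a simple walk of 40 half-steps,
# so P(S_20 = x) = C(40, 20 + x) / 2**40; this gives an independent check.
exact_11 = comb(40, 31) / 2 ** 40
```

For these values `condition_6_holds(dist, -20, 10)` evaluates the inequality (6) at the point $b + 1 = 11$.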



So far, we have described how to perform a one-point reconstruction when presented
with an infinite set of realizations of the random walk $S$. However, in the problem under
consideration in this paper, we only have access to the observations made by a single
random walk $S$. We get around this limitation by using stopping times to restart the
random walk in a predetermined way. This technique yields the infinite set of realizations
we require. Let $\tau_i$ denote the time of the $i$-th visit by $S$ to the origin. Then due to
the strong Markov property, the random walk after time $\tau_i$ behaves like a random walk
starting at the origin. Hence, if we are given the piece of scenery $\xi|_{[a,b]}$ together with
observations $\chi$ and the sequence of stopping times $\tau_1, \tau_2, \ldots$, then we can use the
reconstruction technique for an infinite number of random walks. For this we simply take the
sequence of observations $\chi^i$ made by the $i$-th random walk to be

$$\chi_{\tau_i}, \chi_{\tau_i+1}, \chi_{\tau_i+2}, \ldots, \chi_{\tau_{i+1}-1}.$$
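
The restart construction can be sketched as follows, under the (unrealistic, as the next subsection explains) assumption that the path itself is available so that the visits to the origin can be read off. The function name is ours.

```python
def split_at_origin_visits(path, chi):
    """Cut the color record chi into the pieces seen between successive visits to 0.

    path[t] is the position S_t and chi[t] the color observed at time t.
    Piece number i runs from the i-th visit to the origin up to, but not
    including, the (i+1)-th visit.
    """
    visits = [t for t, s in enumerate(path) if s == 0]
    return [chi[visits[i]:visits[i + 1]] for i in range(len(visits) - 1)]
```

For example, the path `[0, 1, 0, -1, -2, -1, 0, 1]` visits the origin at times 0, 2 and 6, so the record is cut into two complete pieces of lengths 2 and 4.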

2.3 Finding the way back to the origin 

Because we are only given the sequence of observations made by the random walk, we
are not able to determine when it is at the origin and hence the stopping times $\tau_1, \tau_2, \ldots$
are not observable. Instead, we must construct stopping times based on the observations
which are able to stop the random walk close to some point of reference.

Let $K = [k_1, k_2] \subset [-n, n]$ be an integer interval of length $n$. The precise position of
$K$ within $[-n, n]$ is not important here and will be specified later. Our point of reference
will be the location of the finite string $w := \xi_{k_1}\xi_{k_1+1} \cdots \xi_{k_2}$. Let $\tau_i$ be the $i$-th time that
we observe the pattern $w$ in the observations $\chi$. Hence

$$\tau_1 := \min\{t \ge n : \chi_{t-n}\chi_{t-n+1} \cdots \chi_t = w\} \tag{13}$$

and

$$\tau_{i+1} := \min\{t > \tau_i : \chi_{t-n}\chi_{t-n+1} \cdots \chi_t = w\}. \tag{14}$$

Let $R$ be a map $R : [0, n] \to \mathbb{Z}$. We call $R$ a simple random walk path, if

$$\forall t \in [0, n-1], \quad |R(t+1) - R(t)| = 1.$$

If $R(0) = x$, we say the path $R$ starts at $x$. We assume that the scenery $\xi$ is random,
i.i.d. and all five colors appear with equal probability. Let $x \notin [-2n, 2n]$. Let $R$ be a
non-random simple random walk path starting at $x$. What is the probability that $\xi$ seen
along the path $R$ gives the string $w$? In other words, what is the probability that

$$\xi \circ R := \xi(R_0)\xi(R_1) \cdots \xi(R_n) = w?$$

Note that since $x \notin [-2n, 2n]$, the path of $R$ cannot reach $[-n, n]$. Since the scenery is
i.i.d. and since $w$ only depends on $\xi|_{[-n,n]}$, the pattern $w$ is independent of $\xi \circ R$. The five
colors in the scenery having the same probability, we find

$$P(\xi \circ R = w) = \left(\frac{1}{5}\right)^{n+1}.$$

There are $2^n$ simple random walk paths starting at $x$ of length $n$. Hence the probability
that there exists such a path generating the color record $w$ is bounded:

$$P(\exists R \text{ a simple r.w. path starting at } x \text{ such that } \xi \circ R = w) \le \left(\frac{1}{5}\right)^{n+1} \cdot 2^n < \left(\frac{2}{5}\right)^n.$$

It follows that with probability close to one, every time we observe the pattern $w$ before
the time $T = \rho^n$, the random walk must be in the interval $[-2n, 2n]$:

$$S_{\tau_i} \in [-2n, 2n] \quad \text{for every } \tau_i < T,$$

where $\rho > 0$ is a constant not depending on $n$ such that $\rho < (5/2)^2$.

We shall see that this argument can be refined so that the interval in which $S_{\tau_i}$ lies is
much narrower. Also, we shall be dealing with non-simple random walks. Hence it will
be necessary to adapt the present argument to that situation.
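
The observable stopping times of (13) and (14) depend only on the color record, so they can be computed by a plain scan. A minimal sketch, with a function name of our own choosing:

```python
def pattern_stopping_times(chi, w):
    """Return, in increasing order, all times t >= n with chi[t-n], ..., chi[t] equal to w.

    Here n = len(w) - 1, so the pattern has n + 1 letters as in (13) and (14).
    The i-th entry of the result plays the role of tau_i.
    """
    n = len(w) - 1
    w = list(w)
    return [t for t in range(n, len(chi)) if list(chi[t - n:t + 1]) == w]
```

Each returned time marks the last letter of an occurrence of the pattern; overlapping occurrences are all reported.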



3 Single Point Reconstruction 

In this section, we describe the algorithm for reconstructing $\xi(n+1)$ given the
observations $\chi$ and the restriction $\xi|_{[-n,n]}$. Before beginning, it is useful to consider a small
numerical example to illustrate.



3.1 A numerical example 

Assume that the random walk is simple and suppose we are given the finite restriction of
$\xi$ to the interval $I := [-4, 4]$.
We wish to reconstruct the value $\xi(5)$. Let $K := [0, 3]$ and let $w$ be the restriction of $\xi$ to
$K$:

$$w := 4515.$$

Occurrences of the pattern $w$ in the observations $\chi$ define a sequence of stopping times
$(\tau_i)$ like those defined by (13) and (14). Note that if we observe the pattern $w$ and this
was generated while the random walk was in the interval $I$, then we must be located at
either $z = 1$ or $z = 3$. More precisely, if $S_s \in I$ for all $s \in [\tau_i - 3, \tau_i]$, then $S_{\tau_i} \in \{1, 3\}$
and the random walk will be at one of the points 1 or 3 with equal probability. Let $\mu$
denote the two-atom measure which accords probability 1/2 to $\{1\}$ as well as to $\{3\}$.
Let $P_\mu(\cdot)$ denote the probability distribution for the random walk $S$ starting with initial
distribution $\mu$ instead of starting at the origin.

Take $r = 4$. As was done in (8), we define

$$q_e := P_\mu(\xi(S_r) = e) - P_\mu(\xi(S_r) = e, S_r \in [-4, 4]). \tag{15}$$

Then, we look at the empirical frequency of the colors at a fixed offset $r$ following each
stopping time $\tau_i$. More precisely, we estimate $P_\mu(\xi(S_r) = e) = P_\mu(\chi_r = e)$ by the empirical
distribution of

$$\xi(S_{\tau_1+r}), \xi(S_{\tau_2+r}), \ldots, \xi(S_{\tau_j+r}),$$

where $j$ is the largest integer $i$ such that $\tau_i < T$. Let us denote this empirical distribution
by

$$\hat P_\mu(\chi_r = \cdot). \tag{16}$$

We can now describe the reconstruction algorithm.

Algorithm for single point reconstruction. Given $\xi|_{[-4,4]}$ and $\chi$, our estimate of $\xi(5)$
is the element $e$ of the set $\{1, 2, 3, 4, 5\}$ which maximizes the quantity

$$\hat q_e := \hat P_\mu(\chi_r = e) - P_\mu(\xi(S_r) = e, S_r \in [-4, 4]).$$

Note that the second term in the definition of $\hat q_e$ can be calculated explicitly because
$\xi|_{[-4,4]}$ is given.
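
The empirical distribution (16) only needs the color record and the stopping times. A minimal sketch, with a function name and a return convention (an empty dictionary when no stopping time qualifies) that are our own assumptions:

```python
from collections import Counter


def empirical_color_distribution(chi, taus, r, T):
    """Empirical law of the colors chi[tau + r] over all stopping times tau < T.

    Stopping times whose offset tau + r falls beyond the end of the available
    record are skipped. Returns {color: frequency}, or {} if nothing qualifies.
    """
    used = [t for t in taus if t < T and t + r < len(chi)]
    if not used:
        return {}
    counts = Counter(chi[t + r] for t in used)
    return {e: counts[e] / len(used) for e in (1, 2, 3, 4, 5)}
```

Subtracting from each frequency the exactly computable second term of $\hat q_e$ and taking the maximizing color then gives the estimate described in the algorithm.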

To verify that the above algorithm has a high probability of estimating $\xi(5)$ correctly,
we first need to check that

$$P_\mu(S_r = 5) > P_\mu(S_r \notin [-4, 5]). \tag{17}$$

This inequality corresponds to Condition (6). Taking $r = 4$ we find

$$P_\mu(S_r = 5) = \frac{5}{32}, \qquad P_\mu(S_r \in (-\infty, -5] \cup [6, \infty)) = \frac{1}{32},$$

and so inequality (17) is satisfied.
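
The two probabilities entering (17) can be checked by exact enumeration. The sketch below assumes, as in this example, a simple symmetric walk, four steps, and the two-atom starting measure on $\{1, 3\}$; the function names are ours.

```python
from fractions import Fraction


def simple_walk_distribution(start, steps):
    """Exact law of a simple symmetric random walk after `steps` steps from `start`."""
    dist = {start: Fraction(1)}
    for _ in range(steps):
        nxt = {}
        for x, p in dist.items():
            for step in (-1, 1):
                nxt[x + step] = nxt.get(x + step, Fraction(0)) + p / 2
        dist = nxt
    return dist


def mixed_start_distribution(mu, steps):
    """Exact law after `steps` steps when the starting point is drawn from mu."""
    out = {}
    for start, weight in mu.items():
        for x, p in simple_walk_distribution(start, steps).items():
            out[x] = out.get(x, Fraction(0)) + weight * p
    return out


mu = {1: Fraction(1, 2), 3: Fraction(1, 2)}
dist = mixed_start_distribution(mu, 4)
p_hit = dist.get(5, Fraction(0))                               # P_mu(S_4 = 5)
p_out = sum(p for x, p in dist.items() if x < -4 or x > 5)     # P_mu(S_4 outside [-4, 5])
```

Only the start at 3 can leave $[-4, 5]$ within four steps (by reaching 7), which is why the second probability is so much smaller than the first.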

The second problem we need to take care of is verifying that the estimate (16) is precise
enough with high probability. For the algorithm to work, the estimation error needs to
be strictly less than

$$\frac{P_\mu(S_r = 5) - P_\mu(S_r \notin [-4, 5])}{2}. \tag{18}$$

For this we need to show that there exists $T$ such that the following two conditions are
satisfied with high probability:

• The time $T$ is large enough so that there are enough visits by the random walk to
the origin generating the pattern $w$ up to time $T$.

• The time $T$ is not too large, since otherwise during the time interval $[0, T]$ the
random walk will wander too far away from the origin. Far from the origin there
are other places where the pattern $w$ can be generated. Thus, too large a $T$ results
in $S_{\tau_i}$ being distant from the origin at some of the times $\tau_i \in [0, T]$.

In the numerical example used to illustrate above, $n$ has been taken too small for the
required estimate precision to be obtained. One of the main issues in the general context
is to show that for $n$ large enough, we can find a $T$ such that with high probability the
two conditions above can be satisfied simultaneously.

3.2 The one-point reconstruction algorithm 

Now we present an algorithm for estimating $\xi(n+1)$ given the restriction $\xi|_{[-n,n]}$ and the
observations $\chi$.

Unlike the numerical example above, our problem is to reconstruct the scenery when
the random walk is not simple. For this we shall need the concept of a $\delta$-path. The key
feature of a $\delta$-path is that it behaves like a simple random walk most of the time and,
during the proportion $\delta$ of time in which it does not, its freedom of movement is linearly
restricted pro rata.

Definition 3.1 Let $\delta > 0$. We call $R : [0, n] \to \mathbb{Z}$ a $\delta$-path (of length $n$) if the proportion
of steps of non-unit length as well as their total variation is less than $\delta$, that is,

$$|M| < \delta n$$

and

$$\sum_{t \in M} |R(t+1) - R(t)| < \delta n,$$

where the set $M$ is defined by

$$M := \{t \in [0, n-1] : |R(t+1) - R(t)| \ne 1\}.$$

For the reconstruction, we shall use the stopping times $\tau_i$ that occur prior to time
$T = 2.4^{2n}$. We will subsequently show that by taking the parameter $\epsilon > 0$ small enough,
the random walk can be made to follow $\delta$-paths for any time interval of length $n$ before
time $T$.
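
Definition 3.1 translates directly into a check on a finite path. A minimal sketch, in which the function name and the representation of $R$ as a list of $n + 1$ integers are our own assumptions:

```python
def is_delta_path(R, delta):
    """Check Definition 3.1 for a path given as the list R(0), ..., R(n).

    M is the set of times whose step has non-unit length (this includes
    steps of length 0). Both the size of M and the total length of the
    steps in M must be strictly less than delta * n.
    """
    n = len(R) - 1
    jumps = [abs(R[t + 1] - R[t]) for t in range(n)]
    non_unit = [j for j in jumps if j != 1]
    return len(non_unit) < delta * n and sum(non_unit) < delta * n
```

A path with a single jump of length 3 among 20 steps is a $\delta$-path for $\delta = 0.2$ (since $3 < 4$) but not for $\delta = 0.1$ (since $3 \ge 2$).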
Our algorithm for reconstructing $\xi(n+1)$ is defined using three intervals. First

$$I := [-n, n]$$

is the interval on which we already know the scenery $\xi$. Let

$$n^* := n - 61\delta n$$

where $\delta > 0$ is a small constant not depending on $n$. The second interval

$$K = [k_1, k_2] := [n^* - n, n^*]$$

is the interval from which we take the pattern

$$w := \xi(k_1)\xi(k_1+1)\xi(k_1+2) \cdots \xi(k_2).$$

Finally we will show that after reading the pattern $w$ close to the origin we are typically
located in the interval

$$J := [n^* - 21\delta n, n^* + \delta n].$$

Next, let $\nu_i$ be the $i$-th time the random walk generates the pattern $w$ on the piece of
scenery $\xi|_{[-n,n]}$:

$$\nu_1 := \min\{t \ge n : \chi_{t-n}\chi_{t-n+1} \cdots \chi_t = w;\ \forall s \in [t-n, t],\ S_s \in [-n, n]\} \tag{19}$$

and

$$\nu_{i+1} := \min\{t > \nu_i : \chi_{t-n}\chi_{t-n+1} \cdots \chi_t = w,\ S_s \in [-n, n],\ \forall s \in [t-n, t]\}, \quad i \ge 1. \tag{20}$$

The difference between the $\nu_i$'s and the $\tau_i$'s is the additional constraint on the $\nu_i$'s that
the pattern should be generated while the random walk is in $[-n, n]$. This is a priori not
observable. But we will show that the $\tau_i$'s and the $\nu_i$'s coincide up to time $T$ with high
probability.

Note that given $\xi$, the sequence

$$S_{\nu_1}, S_{\nu_2}, S_{\nu_3}, \ldots$$

is a Markov chain whose state space is $[-n, n]$. The transition probability from $x$ to $y$ is
given by

$$P_{xy} := P(S_{\nu_{i+1}} = y \mid S_{\nu_i} = x, \xi).$$

This chain is aperiodic and irreducible. Denote its stationary distribution by $\mu$.
Hence, $\mu$ depends on $\xi$ and is thus a random measure. As before, $P_\mu$ denotes the
measure for which the random walk $S$ has starting distribution $\mu$ and for which the
distribution of the scenery remains unchanged and is independent of the random walk. We
take $r := 90\delta n$ and look at the frequency of the observed color $r$ time steps after a stopping
time. Our estimate for the distribution of $\chi_r$ under $P_\mu$ is the empirical distribution of

$$\xi(S_{\tau_1+r}), \xi(S_{\tau_2+r}), \ldots, \xi(S_{\tau_j+r}),$$

where $j$ denotes the largest integer for which $\tau_j < T$. We shall denote this empirical
distribution by

$$\hat P_\mu(\chi_r = \cdot).$$

We are now ready to describe the reconstruction algorithm for estimating the value
$\xi(n+1)$.

Algorithm: Given $\xi|_{[-n,n]}$ and $\chi$, the estimate $\hat\xi(n+1)$ of $\xi(n+1)$ is the element $e \in
\{1, 2, 3, 4, 5\}$ which maximizes the quantity

$$\hat q_e := \hat P_\mu(\chi_r = e) - P_\mu(\xi(S_r) = e, S_r \in I).$$

Note that the second term on the right-hand side of the definition of $\hat q_e$ can be calculated
explicitly since we know $\xi|_{[-n,n]}$.

3.3 Combinatorial aspects of the one-point reconstruction algorithm

Let $A^n$ denote the event that the above reconstruction algorithm works correctly, that is
the event that $\hat\xi(n+1) = \xi(n+1)$. We list a few events which are important in making
the reconstruction of $\xi(n+1)$ work.

$B^n$: Performing reconstruction will require the random walk to remain near the origin, in
some sense, for a sufficiently long time. To capture this requirement, let $B^n$ be the
event that the random walk stays in the interval $[-2.45^n, 2.45^n]$ for all times prior
to and including $T = 2.4^{2n}$.

$C^n$: In order to obtain sufficiently precise estimates, we will need enough observations
close to the origin. This is taken care of by $C^n$ which is the event that at least
$(1.1)^n$ stopping times $\nu_i$ occur before time $T$:

$$C^n := \{\nu_i < T \text{ for all } i \le 1.1^n\}.$$

$D^n$: We take $D^n$ to be the event that, up to time $T$, all pieces of length $n$ of the path $S$
are $\delta$-paths. More precisely, $D^n$ is the event that for all $s \in [n, T]$, the map

$$[s-n, s] \to \mathbb{Z}, \quad t \mapsto S_t$$

is a $\delta$-path.

$F^n$: Let $F^n$ be the event that in the interval $[-2.45^n, 2.45^n]$ a $\delta$-path can generate $w$ only
if it ends in the interval $J$ and does not leave the interval $I$. More precisely, $F^n$ is
the event that, for all $\delta$-paths

$$R : [0, n] \to [-2.45^n, 2.45^n]$$

for which $\xi \circ R = w$, we have $R(n) \in J$ and $R(l) \in [-n, n]$, for all $l \in [0, n]$.

$G^n$: Let $G^n$ be the event that the precision of the estimate based on the $\nu_i$'s is better
than

$$\frac{P_\mu(S_r = n+1) - P_\mu(S_r \notin [-n, n+1])}{2}. \tag{21}$$

More precisely, denote the empirical distribution of

$$\chi(\nu_1 + r), \chi(\nu_2 + r), \ldots, \chi(\nu_j + r),$$

where $j$ is the largest $i$ for which $\nu_i < T$, by

$$\tilde P_\mu(\chi_r = \cdot).$$

Recall that $r$ is defined to be $90\delta n$. Then $G^n$ is the event that, for all $e \in
\{1, 2, 3, 4, 5\}$, the difference

$$|\tilde P_\mu(\chi_r = e) - P_\mu(\chi_r = e)|$$

is strictly less than the expression given in (21).

The following lemma shows that when all the events $B^n$ through $G^n$ hold, then we can
reconstruct $\xi(n+1)$ correctly.

Lemma 3.1 Assume that expression (21) is strictly positive. Then,

$$B^n \cap C^n \cap D^n \cap F^n \cap G^n \subset A^n.$$

Proof. If $B^n$ holds, then the random walk stays within the interval $[-2.45^n, 2.45^n]$ for
all times up to and including $T$. However, thanks to the event $F^n$, a $\delta$-path within that
interval can generate the pattern $w$ only if it stays within the interval $[-n, n]$. Since, by
$D^n$, every piece of length $n$ of the path up to time $T$ is a $\delta$-path, this means
that when $B^n$ and $F^n$ also hold, $\tau_i = \nu_i$ for all $\tau_i < T$. In this situation, the estimates
of $P_\mu(\chi_r = e)$ based on the $\tau_i$'s and the $\nu_i$'s are identical, since they only make use of
stopping times up to time $T$. The event $C^n$ merely ensures the occurrence of at least $1.1^n$
stopping times by time $T$. When $G^n$ holds, we have that

$$|\tilde P_\mu(\chi_r = e) - P_\mu(\chi_r = e)|$$

is less than (21) and hence for the estimate based on the $\tau_i$'s we have that

$$|\hat P_\mu(\chi_r = e) - P_\mu(\chi_r = e)| \tag{22}$$

is also less than (21). But this is enough to make the reconstruction algorithm work. To
see this, let $a$ be the number

$$a := \frac{P_\mu(S_r = n+1) + P_\mu(S_r \notin [-n, n+1])}{2}.$$

We have proved that when $B^n$, $C^n$, $D^n$, $F^n$ and $G^n$ all hold, then

$$\hat P_\mu(\chi_r = e) - P_\mu(\chi_r = e, S_r \in [-n, n]) > a$$

if $\xi_{n+1} = e$, and

$$\hat P_\mu(\chi_r = e) - P_\mu(\chi_r = e, S_r \in [-n, n]) < a$$

if $\xi_{n+1} \ne e$. This means that the reconstruction algorithm works correctly, since it chooses
as estimate for the color the value $e \in \{1, 2, 3, 4, 5\}$ which maximizes

$$\hat P_\mu(\chi_r = e) - P_\mu(\chi_r = e, S_r \in [-n, n]).$$

This completes the proof.



4 High probability of the events $B^n$, $C^n$, $D^n$, $F^n$ and $G^n$

As a consequence of Lemma 3.1 we see that

$$P(A^{nc}) \le P(B^{nc}) + P(C^{nc}) + P(D^{nc}) + P(F^{nc}) + P(G^{nc}),$$

so in order to show that $A^n$ occurs with probability close to 1, we need only show that
each of $B^n$, $C^n$, $D^n$, $F^n$ and $G^n$ occurs with probability approaching 1 as $n$ becomes large.
We shall do this by showing that the probability that each of these events does not occur
is finitely summable over $n$.



4.1 The event $B^n$

Let $\sigma^2$ be the variance of the increment distribution of the random walk $S$. Conditions (1)
and (2) ensure that $\sigma^2$ is finite. Then, since $S_t$ is a sum of i.i.d. increments, direct
application of Kolmogorov's inequality yields

$$P(B^{nc}) = P\left(\max_{0 \le t \le T} |S_t| > 2.45^n + 1\right) \le \frac{\sigma^2 \, 2.4^{2n}}{(2.45^n + 1)^2} < \sigma^2 \left(\frac{2.4}{2.45}\right)^{2n}.$$

Hence $P(B^{nc})$ is finitely summable and $P(B^n) \to 1$ exponentially fast as $n \to \infty$.



4.2 The event $C^n$

We define some auxiliary events to assist in proving that $C^n$ occurs with high probability.
Let $C_0^n$ be the event that the random walk has exited the interval $[-2.39^n, 2.39^n]$ at
time $T$:

$$C_0^n := \{S_T \notin [-2.39^n, 2.39^n]\}.$$

Next, let $C_1^n$ be the event that there are at least $2.3^n$ visits to the point $k_1 = n - n^*$
before the random walk leaves the interval $[-2.39^n, 2.39^n]$ for the first time.

Define $C_2^n$ to be the event that among the first $2.3^n$ visits to $k_1$ there are at least $1.1^n$
for which the random walk subsequently takes $n$ steps to the right. That is, if $t_i$ denotes
the time of the $i$-th visit to $k_1$, then $C_2^n$ is the event that

$$|\{i \in \mathbb{N} : i \le 2.3^n,\ S_{t_i+s} = k_1 + s,\ \forall s \in [0, n]\}| \ge 1.1^n.$$

It is easy to see that

$$C_0^n \cap C_1^n \cap C_2^n \subset C^n. \tag{23}$$

Thus, in order to prove that $C^n$ occurs with high probability, we need only prove that
each of the three events $C_0^n$, $C_1^n$ and $C_2^n$ occurs with high probability.



Proof that $C_0^n$ occurs with high probability

Let $\rho$ denote the third absolute moment of the increment distribution of $S$ and note that,
as was the case for $\sigma^2$, conditions (1) and (2) guarantee $\rho < \infty$. Then, fixing $b = 2.39/2.4$,
we have

$$P(C_0^n) = P(S_T \notin [-2.39^n, 2.39^n]) = P\left(\frac{S_T}{\sigma T^{1/2}} \notin [-b^n, b^n]/\sigma\right) = 1 - \Phi_{2.4^{2n}}(b^n/\sigma) + \Phi_{2.4^{2n}}(-b^n/\sigma),$$

where $\Phi_t(x) := P\left(\frac{S_t}{\sigma t^{1/2}} \le x\right)$. Let $\Phi$ denote the distribution function of a standard
Gaussian random variable. Now, $\Phi_t \to \Phi$ as $t \to \infty$ by the central limit theorem. Since
$b^n \to 0$, we see that

$$\lim_{n \to \infty} P(C_0^n) = 1.$$

Furthermore,

$$1 - P(C_0^n) = \Phi_T(b^n/\sigma) - \Phi_T(-b^n/\sigma) \le \Phi(b^n/\sigma) - \Phi(-b^n/\sigma) + |\Phi_T(b^n/\sigma) - \Phi(b^n/\sigma)| + |\Phi_T(-b^n/\sigma) - \Phi(-b^n/\sigma)|.$$

By the Berry-Esseen theorem,

$$|\Phi_t(x) - \Phi(x)| \le \frac{\beta\rho}{\sigma^3 t^{1/2}},$$

for all $x \in \mathbb{R}$, $t \ge 1$, where $\beta = 0.7655$. Therefore,

$$1 - P(C_0^n) \le 2\int_0^{b^n/\sigma} (2\pi)^{-1/2} e^{-u^2/2}\,du + \frac{2\beta\rho}{\sigma^3 \, 2.4^n}.$$

Since $\int_0^h (2\pi)^{-1/2} e^{-u^2/2}\,du = (2\pi)^{-1/2} h + O(h^3)$, we then have

$$1 - P(C_0^n) \le 2(2\pi\sigma^2)^{-1/2} b^n + O(b^{3n}) + \frac{2\beta\rho}{2.4^n \sigma^3}.$$

When $n$ is large enough, the right-hand side will be bounded by $2(2\pi)^{-1/2}\sigma^{-1} b^{n/2}$ and so
$P(C_0^n)$ converges exponentially quickly to 1.



Proof that $C_1^n$ occurs with high probability

Let us imagine for a moment that the random walk $S$ is simple, symmetric and starts at
$k_1 + 1$. Let $\eta$ be the first time that $S$ hits $k_1$ or $2.39^n$:

$$\eta = \min\{t \ge 0 : S_t \in \{k_1, 2.39^n\}\}.$$

The random walk is a martingale and hence we have that

$$k_1 + 1 = E[S_0] = E[S_\eta] = k_1 \cdot P(S_\eta = k_1) + 2.39^n P(S_\eta = 2.39^n) \ge 2.39^n P(S_\eta = 2.39^n). \tag{24}$$

Hence, $P(S_\eta = 2.39^n) \le (k_1 + 1)\,2.39^{-n}$ and the simple random walk starting at $k_1 + 1$ has a
probability of at most $(k_1 + 1)(2.39)^{-n}$ of hitting $2.39^n$ before hitting $k_1$. The same
bound holds for a simple random walk starting at $k_1 - 1$ and its probability of hitting
$-2.39^n$ before returning to $k_1$. We note that it is possible to obtain a slightly better
bound, but this suffices here. Hence the number of visits to the point $k_1$ by the random
walk before hitting either $-2.39^n$ or $2.39^n$ is a geometric random variable with parameter
$p \le (k_1 + 1)\,2.39^{-n}$. It follows that the probability of making fewer than $2.3^n$ visits to the
point $k_1$ before hitting the set $\{-2.39^n, 2.39^n\}$ is less than $(k_1 + 1)\left(\frac{2.3}{2.39}\right)^n$. This imposes a
bound on $P(C_1^{nc})$ that is exponentially small in $n$ if the random walk is simple.
So we need to explain why the same kind of result holds for our random walk $S$ whose
increment distribution has exponentially decaying tails.
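
For the simple symmetric walk, the martingale bound (24) can be compared with the exact hitting probability. The sketch below replaces the barrier $2.39^n$ by a hypothetical integer `N` and uses the standard fact that the hitting probability is a linear function of the starting point; function names are ours.

```python
from fractions import Fraction


def prob_hit_upper_first(k, N):
    """Exact P(hit N before k) for a simple symmetric walk started at k + 1.

    Requires integers with k + 1 < N. The function h(x) = (x - k) / (N - k)
    is harmonic for the walk with h(k) = 0 and h(N) = 1, so h(k + 1) is the
    required probability.
    """
    h = lambda x: Fraction(x - k, N - k)
    # Sanity check of harmonicity at every interior point.
    assert all(h(x) == (h(x - 1) + h(x + 1)) / 2 for x in range(k + 1, N))
    return h(k + 1)


def martingale_bound(k, N):
    """The bound (k + 1) / N obtained from the optional stopping argument."""
    return Fraction(k + 1, N)
```

For nonnegative `k` the exact value $1/(N - k)$ never exceeds $(k+1)/N$, which matches the remark that a slightly better bound is available.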

This is done by considering a random walk $\bar S$, which is generated by $S$ and which makes
steps of length at most $n^2$. Whenever the random walk $S$ jumps further than this, $\bar S$ does
not move. Hence, $\bar S_0 = S_0$ and for all $t \in \mathbb{N}$,

$$\bar S_t - \bar S_{t-1} = \begin{cases} S_t - S_{t-1}, & \text{if } |S_t - S_{t-1}| \le n^2, \\ 0, & \text{otherwise.} \end{cases}$$
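
The truncated walk is a deterministic function of the original path and can be sketched as follows; the function name and the parameter `max_jump` (playing the role of $n^2$) are our own.

```python
def truncate_walk(path, max_jump):
    """Copy every increment of `path` of size at most max_jump; stay put otherwise."""
    out = [path[0]]
    for prev, cur in zip(path, path[1:]):
        step = cur - prev
        out.append(out[-1] + (step if abs(step) <= max_jump else 0))
    return out
```

As long as the original path makes no jump larger than `max_jump`, the two paths coincide, which is the situation described by the event introduced next.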

The key here is that the random walks S and S will very likely be identical within a time 
frame of interest. To capture this, we introduce C{\ which is the event that 

S t = S t , V* < 7™. 

Let C™ 2 be the event that the random walk S leaves the interval [—2.39™, 2.39™] no later 
than 7™. 
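The construction of the truncated walk can be sketched in a few lines of Python. This is a minimal illustration, not part of the proof: the names `truncate_walk` and `first_disagreement` and the toy path are assumptions introduced here, and `max_step` stands in for $n^2$.

```python
from itertools import accumulate


def truncate_walk(path, max_step):
    """Copy every increment of `path` whose absolute value is at most
    `max_step`; on larger jumps the truncated walk does not move."""
    if not path:
        return []
    increments = (b - a for a, b in zip(path, path[1:]))
    kept = (d if abs(d) <= max_step else 0 for d in increments)
    return list(accumulate(kept, initial=path[0]))


def first_disagreement(path, truncated):
    """First time at which the two walks differ, or None if they agree."""
    for t, (s, s_bar) in enumerate(zip(path, truncated)):
        if s != s_bar:
            return t
    return None


S = [0, 1, 0, 7, 6, 7]               # toy path with a single jump of size 7
S_bar = truncate_walk(S, max_step=4)
print(S_bar)                          # [0, 1, 0, 0, -1, 0]
print(first_disagreement(S, S_bar))   # 3
```

The two walks coincide up to the first oversized jump, which is exactly the situation the event that both walks agree up to the time horizon is meant to capture.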

We define the stopping times $t_i$ inductively. Let $t_1 = 0$ and let $t_{i+1}$ be the first time no earlier than $t_i + n^4$ that the random walk $\bar S$ visits the interval $[-n^2, n^2]$:

$$t_{i+1} := \min\{t \ge t_i + n^4 : \bar S_t \in [-n^2, n^2]\}.$$

Let $C_{13}^n$ be the event that at least $2.38^n$ stopping times $t_i$ occur before $\bar S$ leaves the interval $[-2.39^n, 2.39^n]$.

Let $Y_i$ be a sequence of Bernoulli random variables marking visits to the point $k_1$ at specific stopping times: $Y_i := 1$ iff $\bar S_{t_i + n^4} = k_1$. Finally, define $C_{14}^n$ to be the event that

$$\sum_{i=1}^{2.38^n} Y_i \ge 2.3^n.$$

Note that

$$C_{11}^n \cap C_{12}^n \cap C_{13}^n \cap C_{14}^n \subset C_1^n. \tag{25}$$

The following argument explains why this is so. Due to $C_{12}^n$, the random walk $\bar S$ leaves the interval $[-2.39^n, 2.39^n]$ before time $7^n$ and by $C_{13}^n$ there are at least $2.38^n$ stopping times $t_i$ before $\bar S$ leaves the interval $[-2.39^n, 2.39^n]$. Thus, $C_{12}^n$ and $C_{13}^n$ together imply that at least $2.38^n$ stopping times $t_i$ will be seen before time $7^n$. The event $C_{14}^n$ forces at least $2.3^n$ of these $2.38^n$ stopping times $t_i$ to be followed by a visit to the point $k_1$ at a time $n^4$ later. Hence, if $C_{12}^n$, $C_{13}^n$ and $C_{14}^n$ all hold, then prior to time $7^n$, the random walk $\bar S$ visits the point $k_1$ at least $2.3^n$ times before finally leaving the interval $[-2.39^n, 2.39^n]$. The event $C_{11}^n$ stipulates that $S$ and $\bar S$ be identical up to time $7^n$. Consequently, $S$ also visits $k_1$ at least $2.3^n$ times before leaving the interval $[-2.39^n, 2.39^n]$ and so $C_1^n$ holds.

From the implication (25) it follows that

$$P(C_1^{nc}) \le P(C_{11}^{nc}) + P(C_{12}^{nc}) + P(C_{13}^{nc}) + P(C_{14}^{nc}).$$

To prove finite summability of $P(C_1^{nc})$ it is enough to prove it for each term on the right-hand side of this inequality.

Finite summability of $P(C_{11}^{nc})$. Note that the event $C_{11}^n$ holds as soon as the random walk $S$ does not take any step of size larger than $n^2$ up to and including time $7^n$. Since the tail of the increment distribution of $S$ is exponentially decaying, we have

$$P(|S_t - S_{t-1}| > n^2) \le c_1 e^{-c_2 n^2},$$

where $c_1, c_2 > 0$ are constants not depending on $n$. Thus,

$$P(C_{11}^{nc}) \le c_1 7^n e^{-c_2 n^2}.$$

Provided $c_2$ is large enough, the expression on the right-hand side of this inequality is indeed finitely summable.
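The summability of the union bound can be illustrated numerically. The sketch below is an assumption-laden example rather than part of the argument: the constants `c1 = c2 = 1.0` are hypothetical, and the bound is evaluated in log space so that $7^n$ does not overflow.

```python
import math


def c11_bound(n: int, c1: float, c2: float) -> float:
    """Union bound c1 * 7^n * exp(-c2 * n^2), evaluated in log space."""
    return c1 * math.exp(n * math.log(7.0) - c2 * n * n)


def partial_sum(n_max: int, c1: float, c2: float) -> float:
    """Sum of the bound over n = 1, ..., n_max."""
    return sum(c11_bound(n, c1, c2) for n in range(1, n_max + 1))


c1, c2 = 1.0, 1.0  # hypothetical constants, chosen only for illustration
print(partial_sum(50, c1, c2), partial_sum(500, c1, c2))
```

The two partial sums agree to machine precision, reflecting the fact that the quadratic term in the exponent eventually dominates the linear growth of $n \ln 7$.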

Finite summability of $P(C_{12}^{nc})$. To see the finite summability of $P(C_{12}^{nc})$, first realize that

$$C_{12}^n = \Big\{\max_{t \le 7^n} |\bar S_t| > 2.39^n\Big\} \supset \{\bar S_{7^n} \notin [-2.39^n, 2.39^n]\}$$

and then emulate the proof that $P(C_0^{nc})$ is finitely summable.

Finite summability of $P(C_{13}^{nc})$. At any stopping time $t_i$, the random walk $\bar S$ will be in the interval $[-n^2, n^2]$. Since $\bar S$ can make steps of size no larger than $n^2$, it can travel a maximum distance of $n^6$ from where it started in elapsed time $n^4$. At time $t_i + n^4$, $\bar S$ will therefore be in $[-n^2 - n^6, n^2 + n^6]$. If $\bar S \in [-n^2, n^2]$ at time $t_i + n^4$, then $t_{i+1} = t_i + n^4$ and so it won't have left $[-2.39^n, 2.39^n]$ by time $t_{i+1}$. On the other hand, suppose that $\bar S$ starts at a point $x_0 \in (n^2, n^2 + n^6]$ and let $\eta$ be the first time that it enters $[2.39^n, 2.39^n + n^2]$ or $[0, n^2]$. Then

$$x_0 = E_{x_0}[\bar S_0] = E_{x_0}[\bar S_\eta] = x_2\, P_{x_0}(\bar S_\eta \in [2.39^n, 2.39^n + n^2]) + x_1\, P_{x_0}(\bar S_\eta \in [0, n^2]), \tag{26}$$

where $x_2$ is the conditional expectation of $\bar S_\eta$ given that $\bar S_\eta \in [2.39^n, 2.39^n + n^2]$ and $x_1$ denotes the conditional expectation of $\bar S_\eta$ given that $\bar S_\eta \in [0, n^2]$. From equation (26) it follows that

$$x_0 \ge x_2\, P_{x_0}(\bar S_\eta \in [2.39^n, 2.39^n + n^2])$$

and, since $x_2 \ge 2.39^n$ and $x_0 \le n^2 + n^6$, we have

$$P_{x_0}(\bar S_\eta \in [2.39^n, 2.39^n + n^2]) \le \frac{n^2 + n^6}{2.39^n}. \tag{27}$$

An identical argument yields the same bound for the random walk hitting $[-2.39^n - n^2, -2.39^n]$ before $[-n^2, 0]$ given that $\bar S$ is in $[-n^2 - n^6, -n^2]$ at time $t_i + n^4$. Hence for all stopping times $t_i$, the expression on the right-hand side of (27) also serves to bound the probability of the random walk leaving $[-2.39^n, 2.39^n]$ before paying a visit to $[-n^2, n^2]$ after time $t_i + n^4$. This implies that the number of stopping times $t_i$ appearing before $\bar S$ exits $[-2.39^n, 2.39^n]$ is a geometric random variable with parameter $p \le (n^2 + n^6)/2.39^n$. Hence, the probability $P(C_{13}^{nc})$ of seeing fewer than $2.38^n$ stopping times $t_i$ before the random walk leaves the interval $[-2.39^n, 2.39^n]$ is less than

$$(n^2 + n^6)\Big(\frac{2.38}{2.39}\Big)^n.$$

This shows that $P(C_{13}^{nc})$ is finitely summable.



Finite summability of $P(C_{14}^{nc})$. Let $\mathcal{F}_i$ be the $\sigma$-algebra generated by $\bar S_0, \bar S_1, \ldots, \bar S_{t_i}$. Then by the Local Central Limit Theorem, we find that for $n$ large enough, there exists a constant $c_4 > 0$ (not depending on $n$) such that

$$P(\bar S_{t_i + n^4} = k_1 \mid \mathcal{F}_i) \ge \frac{c_4}{n^2}$$

almost surely. (We use the fact that at time $t_i$ the random walk $\bar S$ is located in the interval $[-n^2, n^2]$.) This then implies that the sum

$$\sum_{i=1}^{2.38^n} Y_i$$

is bounded below by

$$\sum_{i=1}^{2.38^n} Y_i^*,$$

where $Y_i^*$ are i.i.d. Bernoulli random variables with parameter $c_4/n^2$. Thus, we have

$$P\Big(\sum_{i=1}^{2.38^n} Y_i^* < 2.3^n\Big) = P\Big(\sum_{i=1}^{2.38^n} (1 - Y_i^*) \ge 2.38^n - 2.3^n + 1\Big). \tag{28}$$

Note that the $(1 - Y_i^*)$'s remain i.i.d. Bernoulli but have parameter $1 - c_4/n^2$. The second non-central moment of

$$\sum_{i=1}^{2.38^n} (1 - Y_i^*)$$

is equal to $2.38^n(1 - c_4/n^2)$. Applying Chebychev's inequality without centring yields

$$P\Big(\sum_{i=1}^{2.38^n} (1 - Y_i^*) \ge 2.38^n - 2.3^n + 1\Big) \le \frac{2.38^n(1 - c_4/n^2)}{(2.38^n - 2.3^n)^2} \le 2.38^{-n}(2.38/0.08)^2,$$

which is finitely summable over $n$. From Equation (28) this expression also bounds $P(C_{14}^{nc})$.

Proof that $C_2^n$ occurs with high probability

Recall that $t_i$ denotes the $i$-th visit by the random walk $S$ to the point $k_1$. We will define a subset $\{\kappa_1, \kappa_2, \ldots\}$ of the $t_i$'s inductively as follows. Fix $\kappa_1 := t_1$ and, for $i \ge 1$, let $\kappa_{i+1}$ be the first $t_j$ after time $\kappa_i + n$:

$$\kappa_{i+1} := \min\{t_j \ge \kappa_i + n : j \in \mathbb{N}\}.$$

The $\kappa_i$'s form a strictly increasing sequence. Note that, among the first $2.3^n$ stopping times $t_i$, there are at least $2.3^n/n$ stopping times $\kappa_i$. Let $Y_i$ be a Bernoulli variable which is equal to one iff the random walk takes $n$ steps to the right immediately following time $\kappa_i$:

$$Y_i := 1_{\{S_{\kappa_i + s} = k_1 + s,\ \forall\, 0 \le s \le n\}}.$$

The variables $Y_1, Y_2, \ldots$ are i.i.d. and $P(Y_i = 1) = a^n$, where $(1 - \varepsilon)/2 \le a \le 1/2$. Let $C_3^n$ be the event

$$\sum_{i=1}^{2.3^n/n} Y_i \ge 1.1^n.$$

Then, since among the first $2.3^n$ stopping times $t_i$ there are at least $2.3^n/n$ stopping times $\kappa_i$, we see that $C_3^n$ implies $C_2^n$. Together with Chebychev's inequality, this yields

$$P(C_2^{nc}) \le P\Big(\sum_{i=1}^{2.3^n/n} Y_i < 1.1^n\Big) = P\Big(\sum_{i=1}^{2.3^n/n} (Y_i - E[Y_i]) < 1.1^n - 2.3^n a^n/n\Big)$$

$$\le P\Big(\sum_{i=1}^{2.3^n/n} (Y_i - E[Y_i]) < 1.1^n - 1.15^n/n\Big) \le \frac{2.3^n\, 0.5^n/n}{(1.15^n/n - 1.1^n)^2} = n\, 1.15^{-n}\big(1 - n(1.1/1.15)^n\big)^{-2}.$$

Since $n(1.1/1.15)^n \downarrow 0$ as $n \to \infty$, $P(C_2^{nc}) \le 529\, n \cdot 1.15^{-n}$ and so $P(C_2^{nc})$ is indeed finitely summable.
4.3 The event $D^n$



Let $R : [0,n] \to \mathbb{Z}$ be a piece of length $n$ taken from the sample path of $S$. In order to prove $D^n$ occurs with high probability, we must first obtain a bound on the probability of $R$ being a $\delta$-path. Towards this end, define

$$D_0^n := \{R : [1,n] \to \mathbb{Z} \text{ is a } \delta\text{-path}\}.$$

Also, set $X_i := |R_{i+1} - R_i| \cdot 1_{\{|R_{i+1} - R_i| \ne 1\}}$ and $Y_i := 1_{\{|R_{i+1} - R_i| \ne 1\}}$. The $X_i$'s represent the sizes of jumps over distances greater than unity while the $Y_i$'s indicate those times at which a non-unit jump occurred. Note that $X_i = Y_i = 0$ whenever the $i$-th jump is of unit length. Observe that by definition, $D_0^n = D_1^{nc} \cap D_2^{nc}$, where

$$D_1^n := \Big\{\sum_{i=0}^{n-1} X_i \ge \delta n\Big\}$$

and

$$D_2^n := \Big\{\sum_{i=0}^{n-1} Y_i \ge \delta n\Big\}.$$

Therefore, $P(D_0^{nc}) \le P(D_1^n) + P(D_2^n)$. Our aim is to show that $P(D_1^n)$ and $P(D_2^n)$ are exponentially small as $n$ becomes large.

From the moment generating function of $X_i$, we have

$$E[e^{sX_i}] = P(X_i = 0) + \sum_{j=2}^{\infty} e^{sj} P(X_i = j) \le \varepsilon + \varepsilon \sum_{j=2}^{\infty} e^{-(c-s)j} = \varepsilon\Big(1 + \frac{e^{-2(c-s)}}{1 - e^{-(c-s)}}\Big).$$

Applying Chernoff's inequality, we have

$$P(D_1^n) = P\Big(\sum_{j=0}^{n-1} X_j \ge \delta(n+1)\Big) \le E\Big[e^{s\left(\sum_{j=0}^{n-1} X_j - \delta(n+1)\right)}\Big] \le e^{-\delta(n+1)s}\, \varepsilon^n \Big(1 + \frac{e^{-2(c-s)}}{1 - e^{-(c-s)}}\Big)^n$$

for all $s > 0$. Since $\frac{e^{-2(c-s)}}{1 - e^{-(c-s)}}$ is strictly increasing on $[0, \infty)$, we see that

$$P(D_1^n) \le e^{-c^+ n},$$

where $c^+ := -\ln\Big(1 + \frac{e^{-2c}}{1 - e^{-c}}\Big) - \ln \varepsilon$. Observe that $c^+$ can always be made positive and as large as we like by choosing $\varepsilon$ sufficiently small. In particular, $\varepsilon$ can be chosen to ensure that $c^+ > c' := c + 2\ln 2.4 + \ln 2$, whence $P(D_1^n) \le e^{-c'n}$ for all $n$.

Next,

$$E[e^{sY_i}] = e^s P(|R_{i+1} - R_i| \ne 1) \le \varepsilon e^s.$$

Applying the same standard large deviation argument used above, we obtain

$$P(D_2^n) = P\Big(\sum_{j=0}^{n-1} Y_j \ge \delta(n+1)\Big) \le e^{-\delta(n+1)s}\big(E[e^{sY_1}]\big)^n \le e^{ns} \varepsilon^n,$$

for all $s > 0$. Since $e^{ns}$ is strictly increasing in $s$, we have $P(D_2^n) \le \varepsilon^n \le e^{-c'n}$, provided $\varepsilon$ is once again chosen small enough.

Now, the probability of seeing a $\delta$-path in a piece of $S$ of length $n$ can be bounded below by

$$P(D_0^n) \ge 1 - P(D_1^n) - P(D_2^n) \ge 1 - 2e^{-c'n} \ge 1 - e^{-c''n},$$

where $c'' = c + 2\ln 2.4$. Therefore, $P(D_0^n) \nearrow 1$ exponentially fast as $n \to \infty$.

Now, $D^n$ may be expressed as the intersection

$$D^n = \bigcap_{s=n}^{T} D^{n,s},$$

where $D^{n,s}$ is the event that the segment

$$S : [s - n, s] \to \mathbb{Z}$$

of the sample path of $S$ is a $\delta$-path. Then,

$$P(D^{nc}) = P\Big(\bigcup_{s=n}^{T} D^{n,s,c}\Big) \le \sum_{s=n}^{T} P(D^{n,s,c}) \le (T - n + 1)e^{-c''n} \le 2.4^{2n} e^{-(c + 2\ln 2.4)n} = e^{-cn},$$

which shows that $P(D^n) \to 1$ exponentially fast.
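The last chain of inequalities can be verified in log space for a range of $n$. This Python sketch is illustrative only: it assumes $T = 2.4^n$ as used elsewhere in the text, and the decay rate `c = 0.5` is a hypothetical choice.

```python
import math

LOG_24 = math.log(2.4)


def log_union_bound(n: int, c: float) -> float:
    """log of (T - n + 1) * exp(-c'' n), with T = 2.4^n and c'' = c + 2 ln 2.4."""
    T = 2.4 ** n
    c_double_prime = c + 2.0 * LOG_24
    return math.log(T - n + 1) - c_double_prime * n


def log_target(n: int, c: float) -> float:
    """log of the final bound exp(-c n)."""
    return -c * n


c = 0.5  # hypothetical decay rate, for illustration only
for n in (1, 10, 100):
    print(n, log_union_bound(n, c), log_target(n, c))
```

Working with logarithms avoids underflow for large $n$ and makes the slack visible: the number of windows contributes at most $n \ln 2.4$, while $c''$ reserves $2n \ln 2.4$ for it.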
4.4 The event $F^n$



Let $R : [0,n] \to \mathbb{Z}$ denote a $\delta$-path of length $n$. Define $F_0^n$ to be the event that for all $x \in [-2.45^n, 2.45^n] \setminus [-2n - \delta n, 3n]$, there exists no $\delta$-path $R$ such that

$$\xi \circ R = w \quad \text{and} \quad R(n) = x. \tag{29}$$

Define $F_1^n$ to be the event that for all $x \in [n^* + \delta n, 3n]$ there exists no $\delta$-path $R$ satisfying (29) and let $F_2^n$ denote the event that, for all $x \in [-2n - \delta n, n^* - 21\delta n]$, there exists no $\delta$-path $R$ satisfying (29). Then let $F_J^n$ be the event that for any $\delta$-path $R : [0,n] \to [-2.45^n, 2.45^n]$ satisfying $\xi \circ R = w$, we have $R(n) \in J$. It should be clear that

$$F_0^n \cap F_1^n \cap F_2^n \subset F_J^n. \tag{30}$$

Let $F_{J^-}^n$ be the event that for any $\delta$-path $R : [0,n] \to [-2.45^n, 2.45^n]$ satisfying $\xi \circ R = w$, we have $R(0) \in J^-$ where

$$J^- := [n^* - n - \delta n,\ n^* - n + 21\delta n].$$

Note that for any $\delta$-path $R : [0,n] \to \mathbb{Z}$ with $R(0) \in J^-$ and $R(n) \in J$, we will have

$$n^* - n + (1 - \delta)s - 2\delta n \le R(s) \le n^* - n + (1 - \delta)s + 2\delta n, \quad \forall s \in [0,n]. \tag{31}$$

Fixing $\delta$ so that $63\delta < 1$, (31) implies that $R(s) \in [-n, n]$ for all $s \in [0,n]$. Consequently,

$$F_J^n \cap F_{J^-}^n \subset F^n$$

and hence

$$P(F^{nc}) \le P(F_J^{nc}) + P(F_{J^-}^{nc}).$$

The proof that $F_{J^-}^n$ occurs with high probability is similar to that for $F_J^n$ and we shall leave it to the reader.

To prove the finite summability of $P(F_J^{nc})$ we use (30), which implies that

$$P(F_J^{nc}) \le P(F_0^{nc}) + P(F_1^{nc}) + P(F_2^{nc}).$$

Hence, we only need to prove high probability of the events $F_0^n$, $F_1^n$ and $F_2^n$. The key to proving this is the following lemma:

Lemma 4.1 Let $x$ be a point. For every $n$, the total number of $\delta$-paths of length $n$ starting at $x$ is less than

$$2^{n(1 + H(\delta) + 2\delta)}.$$

Define a $(k, \delta)$-path of length $kn$ to be a path comprising a total of $kn$ steps of which fewer than $\delta n$ are non-unit and for which the total variation of the non-unit steps is also less than $\delta n$. Clearly, all $(1, \delta)$-paths of length $n$ are $\delta$-paths of length $n$ and conversely. Then, the number of $(10\delta, \delta)$-paths of length $10\delta n$ starting at $x$ is less than

$$2^{10\delta n(1 + H(0.1) + 0.3)}.$$
Proof. First, if all steps are either $+1$ or $-1$, we have at most $2^n$ choices, since the path makes at most $n$ jumps of length 1. Then among the $n$ steps, we must consider which of them are not of unit length. There are at most $\delta n$ such steps and, since $\delta$ is taken to be small (in particular $\delta < 1/2$), we shall have fewer than $\binom{n}{\delta n}$ possible arrangements of the non-unit jumps among the unit jumps. An argument involving Stirling's approximation reveals the bound $2^{H(\delta)n}$ on the number of possible arrangements, where the entropy $H(\delta)$ is given by $H(\delta) := -\delta \log_2 \delta - (1 - \delta)\log_2(1 - \delta)$.

Next we need to take into account the length of each non-unit step. We know that the total variation of the non-unit jumps is no more than $\delta n$. Hence, we can partition an interval of length at most $\delta n$ to determine the lengths of the non-unit steps which are not of length 0. This gives a maximum of $2^{\delta n}$ possibilities. Because we can have steps of length zero, we must also determine which of the non-unit steps have length zero and this gives another $2^{\delta n}$ possibilities at most. Finally, we need to account for the signs of the non-unit steps. Since there are only two possible signs, there are at most $2^{\delta n}$ choices. Multiplying all the preceding bounds yields an upper bound on the maximum number of $\delta$-paths of length $n$ which start at $x$:

$$2^{n(1 + H(\delta) + 3\delta)}.$$

To compute a bound on the number of $(10\delta, \delta)$-paths of length $10\delta n$ starting at $x$, we proceed as before. Determining the steps of size $+1$ and $-1$, we have at most $2^{10\delta n}$ choices. Then we need to factor in the steps which are not of unit size. There are at most $\delta n$ of these, which corresponds to a proportion of no more than 0.1 of the length of the path. Hence, we have at most $2^{10\delta n H(0.1)}$ choices. Finally, determining the length and sign of the non-unit steps as before gives a maximum of $2^{3\delta n}$ choices. Combining these bounds in a product, we find that there are fewer than

$$2^{10\delta n(1 + H(0.1) + 0.3)}$$

$(10\delta, \delta)$-paths of length $10\delta n$ starting at $x$. $\blacksquare$
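The entropy bound obtained from Stirling's approximation can be checked directly for moderate $n$. The Python sketch below is an illustration under stated assumptions: the helper names are hypothetical, and it counts placements of at most $\lfloor \delta n \rfloor$ non-unit steps, which is slightly more generous than the single binomial coefficient in the proof.

```python
import math


def binary_entropy(p: float) -> float:
    """H(p) = -p log2 p - (1 - p) log2(1 - p), with H(0) = H(1) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def log2_placements(n: int, delta: float) -> float:
    """log2 of the number of ways to choose at most floor(delta * n)
    positions for the non-unit steps among n steps."""
    k_max = math.floor(delta * n)
    return math.log2(sum(math.comb(n, k) for k in range(k_max + 1)))


n, delta = 1000, 0.1
print(log2_placements(n, delta), n * binary_entropy(delta))
```

The first number stays below the second, in line with the bound $2^{H(\delta)n}$ on the number of arrangements for $\delta < 1/2$.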

Proof that $F_0^n$ occurs with high probability

Let $x$ belong to the set

$$X := [-(2.45)^n, (2.45)^n] \setminus [-2n - \delta n, 3n].$$

Now, $n + \delta n - 1$ is the maximum distance a $\delta$-path can travel in $n$ steps. So any $\delta$-path of length $n$ which visits $x$ will stay outside $[-n, n]$. Thus, if $R : [0,n] \to \mathbb{Z}$ is a non-random $\delta$-path such that $R(n) = x$, then $\xi \circ R$ is independent of $w$. To see this, recall that the scenery is i.i.d. and so disjoint parts of it are independent of each other. Note that the pattern $w$ is a substring of $\xi_{-n}\xi_{-n+1} \ldots \xi_n$ while $\xi \circ R$ only depends on parts of the scenery outside $[-n, n]$. Furthermore, the string $w$ is i.i.d. with each of the five possible colors appearing with probability $1/5$, that is,

$$P(\xi \circ R = w) = (1/5)^n,$$

where $R$ is a non-random $\delta$-path of length $n$ ending at $x$. Since there are at most $2^{n(1 + H(\delta) + 2\delta)}$ $\delta$-paths of length $n$ ending at $x$, we get

$$P(F_{0,x}^{nc}) \le \Big(\frac{2^{1 + H(\delta) + 2\delta}}{5}\Big)^n \tag{32}$$

where $F_{0,x}^n$ designates the event that there exists no $\delta$-path of length $n$ for which $\xi \circ R = w$ and $R(n) = x$. Finally, $F_0^n$ may be expressed as

$$F_0^n = \bigcap_{x \in X} F_{0,x}^n.$$

Hence,

$$P(F_0^{nc}) \le \sum_{x \in X} P(F_{0,x}^{nc}) \le 2 \cdot 2.45^n \Big(\frac{2^{1 + H(\delta) + 2\delta}}{5}\Big)^n = 2\Big(\frac{2.45 \cdot 2^{1 + H(\delta) + 2\delta}}{5}\Big)^n, \tag{33}$$

since $X$ contains fewer than $2 \cdot 2.45^n$ elements. Now, $H(\delta) + 2\delta$ converges to zero as $\delta$ tends to zero. Hence, by taking $\delta > 0$ small enough, we see that

$$\frac{2.45 \cdot 2^{1 + H(\delta) + 2\delta}}{5} \tag{34}$$

can be made as close as we like to $2.45 \cdot 2/5 = 0.98 < 1$. Fixing $\delta > 0$ small enough so that expression (34) is strictly less than 1, we have that inequality (33) constitutes a negative exponential bound for $P(F_0^{nc})$.
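How small $\delta$ must be for the base of the exponential bound to drop below 1 can be explored numerically. The sketch below is only illustrative: the function names are hypothetical and the sample values of $\delta$ are arbitrary choices, not values taken from the text.

```python
import math


def binary_entropy(p: float) -> float:
    """H(p) = -p log2 p - (1 - p) log2(1 - p) for 0 < p < 1."""
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def f0_ratio(delta: float) -> float:
    """Base of the exponential bound: 2.45 * 2^(1 + H(delta) + 2 delta) / 5."""
    return 2.45 * 2.0 ** (1.0 + binary_entropy(delta) + 2.0 * delta) / 5.0


for delta in (0.003, 0.002, 0.001):  # arbitrary sample values
    print(delta, f0_ratio(delta))
```

The margin is thin because the limiting value is 0.98: in this sketch the ratio is still above 1 at $\delta = 0.003$ and falls below 1 at $\delta = 0.002$.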



Proof that $F_1^n$ occurs with high probability

Let $x$ be a non-random integer such that

$$x \in [n^* + \delta n, 3n]. \tag{35}$$

Assume also that $R : [0,n] \to \mathbb{Z}$ is a non-random $\delta$-path such that $R(n) = x$. Note that because $R$ is a $\delta$-path, $R(n - i)$ is never further from $x$ than $\delta n + i$ and hence because of (35) we have $R(n - i) \ge n^* - i$. It follows that the $(n - i)$-th letter of the word $w$, which we denote by $w(n - i)$ and which is equal to $\xi_{n^* - i}$, is independent of

$$\xi(R(n - i))\,\xi(R(n - i + 1)) \ldots \xi(R(n)).$$

This holds for all $i = 0, 1, \ldots, n$. Hence we find that if (35) holds with $R(n) = x$ for a non-random $\delta$-path $R$, then

$$P(\xi \circ R = w) = (1/5)^n.$$

Next, we follow the proof that $F_0^n$ occurs with high probability. First let $F_{1,x}^n$ be the event that there exists no $\delta$-path of length $n$ ending in $x$ and generating $w$. Since the number of such paths, according to Lemma 4.1, is no greater than $2^{n(1 + H(\delta) + 2\delta)}$, we have

$$P(F_{1,x}^{nc}) \le \Big(\frac{2^{1 + H(\delta) + 2\delta}}{5}\Big)^n.$$

Finally, there are fewer than $3n$ points $x$ in the set $[n^* + \delta n, 3n]$ and hence the last inequality above yields

$$P(F_1^{nc}) \le 3n\Big(\frac{2^{1 + H(\delta) + 2\delta}}{5}\Big)^n. \tag{36}$$

Once again choosing $\delta > 0$ small enough, we have

$$\frac{2^{1 + H(\delta) + 2\delta}}{5} < 1,$$

whence inequality (36) gives the desired negative exponential bound on $P(F_1^{nc})$.



Proof that $F_2^n$ occurs with high probability

Let $x \in [-2n - \delta n, n^* - 21\delta n]$. Assume that $R : [0,n] \to \mathbb{Z}$ is a $\delta$-path such that $R(n) = x$. Note then that for all $i \le 10\delta n$ we have

$$R(n - i) \le x + i + \delta n \le x + 10\delta n + \delta n \le n^* - 10\delta n.$$

Hence

$$\xi(R(n - 10\delta n))\,\xi(R(n - 10\delta n + 1))\,\xi(R(n - 10\delta n + 2)) \ldots \xi(R(n))$$

is independent of

$$\xi_{n^* - 10\delta n}\,\xi_{n^* - 10\delta n + 1} \ldots \xi_{n^*},$$

which corresponds to the last $10\delta n$ letters of the word $w$. Hence, we get

$$P(\xi \circ R = w) \le \Big(\frac{1}{5}\Big)^{10\delta n}.$$

By Lemma 4.1 there are no more than $2^{10\delta n(1 + H(0.1) + 0.3)}$ $(10\delta, \delta)$-paths of length $10\delta n$ ending in a given point $x$. Let $F_{2,x}^n$ be the event that there is no $\delta$-path of length $n$ such that $R(n) = x$ and $\xi \circ R = w$. Then

$$P(F_{2,x}^{nc}) \le \Big(\frac{2^{10\delta(1 + H(0.1) + 0.3)}}{5^{10\delta}}\Big)^n$$

for all $x \in [-2n - \delta n, n^* - 21\delta n]$. Since there are no more than $3n$ points in $[-2n - \delta n, n^* - 21\delta n]$, we find

$$P(F_2^{nc}) \le 3n\Big(\frac{2^{1 + H(0.1) + 0.3}}{5}\Big)^{10\delta n}. \tag{37}$$

Now, $\frac{2^{1 + H(0.1) + 0.3}}{5}$ is strictly less than 1, so expression (37) provides the desired exponential bound on $P(F_2^{nc})$.
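The claim that the base in (37) is below 1 is a one-line computation. The following Python sketch evaluates it and the resulting bound; the names `RHO` and `f2_bound` and the value `delta = 0.002` are hypothetical choices made for illustration.

```python
import math

# Binary entropy at 0.1, written out explicitly.
H_01 = -0.1 * math.log2(0.1) - 0.9 * math.log2(0.9)

# Base of the exponential bound: 2^(1 + H(0.1) + 0.3) / 5.
RHO = 2.0 ** (1.0 + H_01 + 0.3) / 5.0


def f2_bound(n: int, delta: float) -> float:
    """Right-hand side of the bound: 3n * RHO^(10 * delta * n)."""
    return 3.0 * n * RHO ** (10.0 * delta * n)


print(RHO)
print(f2_bound(5000, 0.002), f2_bound(10000, 0.002))
```

Because the exponent is $10\delta n$ rather than $n$, the decay is slow for small $\delta$: the bound only becomes small once $n$ is large compared with $1/\delta$.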



4.5 The event $G^n$



We shall use $\hat\mu$ to denote the empirical distribution of

$$S_{\nu_1}, S_{\nu_2}, \ldots, S_{\nu_j},$$

where the random variable $j$ is the largest $i$ such that $\nu_i \le T = 2.4^n$; that is,

$$\hat\mu(x) := \hat\mu(\{x\}) = \frac{1}{j}\sum_{i=1}^{j} 1_{\{S_{\nu_i} = x\}}.$$

Let $G_1^n$ denote the event that the difference between $\hat\mu$ and $\mu$ is less than or equal to $n^{3/2}/1.1^{n/2}$ in total variation:

$$G_1^n := \Big\{\sum_{x \in [-n,n]} |\hat\mu(x) - \mu(x)| \le \frac{n^{3/2}}{1.1^{n/2}}\Big\}.$$

Next, set $G_2^n := B^n \cap C^n \cap D^n \cap F^n$. Clearly, we can write $P(G_1^{nc}) \le P(G_1^{nc} \mid G_2^n) + P(G_2^{nc})$. Since $P(G_2^{nc}) \le P(B^{nc}) + P(C^{nc}) + P(D^{nc}) + P(F^{nc})$ and the 4 quantities on the right-hand side are finitely summable, $P(G_2^{nc})$ is itself finitely summable. Therefore, to show that $P(G_1^{nc})$ is finitely summable, it is enough to prove finite summability of $P(G_1^{nc} \mid G_2^n)$. Towards this end, observe that the event $G_2^n$ implies that at least $1.1^n$ stopping times $\nu_i$ manifest by time $T$ and that each of these times marks the end of a $\delta$-path of length $n$ in $I$ whose final position lies in $J$. As a consequence, $\hat\mu$ has support on $J \subset I$ where $|J| = 22\delta n + 1 < n < 2n + 1 = |I|$ for $\delta > 0$ sufficiently small. Now, $\sum_{x \in J} |\hat\mu(x) - \mu(x)| \le n \sup_{x \in J} |\hat\mu(x) - \mu(x)|$. Conditioning on $j = k$, we have

$$P\Big(\sum_{x \in J} |\hat\mu(x) - \mu(x)| > \frac{n^{3/2}}{1.1^{n/2}} \,\Big|\, \{j = k\} \cap G_2^n\Big) \le P\Big(\sup_{x \in J} |\hat\mu(x) - \mu(x)| > \frac{n^{1/2}}{1.1^{n/2}} \,\Big|\, \{j = k\} \cap G_2^n\Big)$$

$$= P\Big(\bigcup_{x \in J} \Big\{|\hat\mu(x) - \mu(x)| > \frac{n^{1/2}}{1.1^{n/2}}\Big\} \,\Big|\, \{j = k\} \cap G_2^n\Big) \le \sum_{x \in J} P\Big(|\hat\mu(x) - \mu(x)| > \frac{n^{1/2}}{1.1^{n/2}} \,\Big|\, \{j = k\} \cap G_2^n\Big). \tag{38}$$
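Next, define $Y_0(x) = 0$ and $Y_i(x) = \sum_{k=1}^{i} X_k(x)$, for $i \in \mathbb{N}$, where $X_i(x) = 1_{\{S_{\nu_i} = x\}} - \mu(x)$. Now $Y_i(x)$ is a martingale with increment size bounded by 1 and $Y_k(x) - Y_0(x) = k(\hat\mu(x) - \mu(x))$. An application of the Azuma-Hoeffding inequality shows that

$$P\Big(\hat\mu(x) - \mu(x) \ge \frac{n^{1/2}}{1.1^{n/2}} \,\Big|\, \{j = k\} \cap G_2^n\Big) = P\Big(Y_k(x) - Y_0(x) \ge \frac{k\,n^{1/2}}{1.1^{n/2}}\Big) \le e^{-\frac{k^2 n/1.1^n}{2k}} \le e^{-0.5nk/1.1^n}.$$

Similarly, $P\big(\hat\mu(x) - \mu(x) \le -n^{1/2}/1.1^{n/2} \mid \{j = k\} \cap G_2^n\big) \le e^{-0.5nk/1.1^n}$ so that

$$P\Big(|\hat\mu(x) - \mu(x)| > \frac{n^{1/2}}{1.1^{n/2}} \,\Big|\, \{j = k\} \cap G_2^n\Big) \le 2e^{-0.5nk/1.1^n}.$$

Combining this bound with (38) then yields

$$P\Big(\sum_{x \in J} |\hat\mu(x) - \mu(x)| > \frac{n^{3/2}}{1.1^{n/2}} \,\Big|\, \{j = k\} \cap G_2^n\Big) \le 2|J|e^{-0.5nk/1.1^n} \le 2ne^{-0.5nk/1.1^n}.$$

Since $C^n$ guarantees $j \ge 1.1^n$, $j$ will belong to $[1.1^n, T]$ and we find that

$$P(G_1^{nc} \mid G_2^n) = \sum_{k=1.1^n}^{T} P(G_1^{nc} \mid \{j = k\} \cap G_2^n)\,P(j = k \mid G_2^n) \le \sum_{k=1.1^n}^{T} 2ne^{-0.5nk/1.1^n}\,P(j = k \mid G_2^n) \le 2ne^{-n/2},$$

which is finitely summable.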
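The way the Azuma-Hoeffding bound combines with the union bound over the support can be made concrete with a short computation. This Python sketch is illustrative only: the function names are hypothetical, and `size_J = n` uses the crude estimate $|J| \le n$ from the text.

```python
import math


def azuma_two_sided(n: int, k: float) -> float:
    """Two-sided bound 2 * exp(-0.5 * n * k / 1.1^n) for a single site x,
    given k stopping times."""
    return 2.0 * math.exp(-0.5 * n * k / 1.1 ** n)


def g1_conditional_bound(n: int, k: float, size_J: int) -> float:
    """Union bound over the size_J sites that carry the empirical measure."""
    return size_J * azuma_two_sided(n, k)


n = 40
k_min = 1.1 ** n  # smallest number of stopping times allowed on the event
print(g1_conditional_bound(n, k_min, size_J=n), 2 * n * math.exp(-n / 2))
```

The bound is decreasing in $k$, so its worst case over $k \in [1.1^n, T]$ is attained at $k = 1.1^n$, where it reduces to $2ne^{-n/2}$.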

Now, if it could be shown that $G_1^n \cap G_2^n \subset G^n$, then we would have $P(G^{nc}) \le P(G_1^{nc}) + P(G_2^{nc})$. However, we have already shown that the two terms on the right-hand side are finitely summable, whence $P(G^{nc})$ would be finitely summable.

So, it remains to show that $G_1^n \cap G_2^n \subset G^n$ for $n$ large enough. In order to do this, we need to show that

$$P_{\hat\mu}(S_r = n + 1) - P_{\hat\mu}(S_r \notin [-n, n + 1]) \ge \Big(\frac{1 - \varepsilon}{2}\Big)^{90\delta n + 2}. \tag{39}$$

Recall that $r$ is defined to be $90\delta n$. We shall take $\delta > 0$ small enough so that $(2/(1 - \varepsilon))^{90\delta} < \sqrt{1.1}$. Then for all $n$ large enough, the bound on the right side of (39) is strictly larger than $n^{3/2}/1.1^{n/2}$. However, the event $G_1^n$ guarantees $n^{3/2}/1.1^{n/2}$ as an upper bound on the difference in total variation between the measures $\hat\mu$ and $\mu$. In turn, the difference in total variation between $\hat\mu$ and $\mu$ bounds the difference between $P_{\hat\mu}(\chi_r = e)$ and $P_\mu(\chi_r = e)$ for every $e \in \{1, 2, 3, 4, 5\}$. Thus, we have:

$$|P_{\hat\mu}(\chi_r = e) - P_\mu(\chi_r = e)| \le \sum_{x} |\hat\mu(x) - \mu(x)| \le \frac{n^{3/2}}{1.1^{n/2}}. \tag{40}$$
Note that -P M (Xr = e) = P^Xr = e). Therefore, if (39) holds,



£|*(Xr = e) - P M (Xr 

will be bounded by the quantity in ( 12TT) for n large enough and hence G n will hold. 

To finish, we return to the question of (39) holding. Since $G_2^n$ holds, the support of $\hat\mu$ is in the interval $J$. Let us first assume that the random walk $S$ is aperiodic and symmetric with the increment distribution given in (11). We will deal with the more general case later. Now, $S$ is able to reach the point $n + 1$ from any point of $J$ in $r$ steps with probability greater than or equal to $((1 - \varepsilon)/2)^r = ((1 - \varepsilon)/2)^{90\delta n}$. The minimum distance from any point $x \in J$ to the point $n + 1$ is $60\delta n$. Using (12) with $b = 60\delta n$ and $q = 1.5$, we calculate the decay in the tail of the state probability distribution beyond a distance of $60\delta n$ from $x$ after $90\delta n$ steps to be

PV ' ; P(S r = n + l) 1.5 + 1 5 
Hence, for any point $x \in J$, we have

$$P_x(S_r = n + 1) - P_x(S_r > n + 1) - P_x(S_r < -n) \ge (1 - 0.25 - 0.25)\,P_x(S_r = n + 1) = 0.5\,P_x(S_r = n + 1),$$

but as mentioned above, $P_x(S_r = n + 1) \ge ((1 - \varepsilon)/2)^r = ((1 - \varepsilon)/2)^{90\delta n}$ and hence we obtain

$$P_x(S_r = n + 1) - P_x(S_r \notin [-n, n + 1]) \ge \Big(\frac{1 - \varepsilon}{2}\Big)^{90\delta n + 1}.$$

Here $P_x(\cdot)$ refers to the distribution when $S$ is started at the point $x$. As the event $G_1^n$ holds, the measure $\hat\mu$ has support on the interval $J$ and so

$$P_{\hat\mu}(S_r = n + 1 \mid F^n) - P_{\hat\mu}(S_r \notin [-n, n + 1] \mid F^n) \ge \Big(\frac{1 - \varepsilon}{2}\Big)^{90\delta n + 1}.$$

Dividing both sides by two then yields (39).

While the veracity of (39) has been discussed assuming the aperiodic, symmetric random walk of Section 2.2, it is not too difficult to show that it holds for any random walk satisfying Conditions (i)-(iii) given $G_2^n$, provided we select the parameters of the random walk appropriately. In other words, for large $n$, $G_1^n \cap G_2^n \subset G^n$ when $\varepsilon > 0$ is sufficiently small and $c > 0$ is sufficiently large.



5 Reconstruction of the whole scenery

To conclude, we prove that we can reconstruct the whole scenery $\xi$ a.s. up to equivalence. For this we use the following lemma, which was proved in [17].
Lemma 5.1 Assume that there exists an algorithm which reconstructs $\xi$ with probability strictly greater than 1/2. Then we can also reconstruct $\xi$ with probability one. More precisely, if there exists a measurable map

$$\mathcal{A}^* : \{1,2,3,4,5\}^{\mathbb{N}} \to \{1,2,3,4,5\}^{\mathbb{Z}}$$

such that

$$P(\mathcal{A}^*(\chi) \approx \xi) > 1/2,$$

then there exists a measurable map

$$\mathcal{A} : \{1,2,3,4,5\}^{\mathbb{N}} \to \{1,2,3,4,5\}^{\mathbb{Z}}$$

such that

$$P(\mathcal{A}(\chi) \approx \xi) = 1.$$

So, we need to build a map which reconstructs $\xi$ with probability strictly greater than 1/2. Due to inequality (5) there exists a non-random $n_0$ such that
P (i(n + 1) = £(n + l),|(-n - 1) = £(-n - 1), Vn > n ) > ^. (41) 

We tune the parameter $\varepsilon > 0$ small enough so that the random walk $S$ only makes $\pm 1$ steps for a long time. It is known that we can reconstruct a finite piece of $\xi$ close to the origin (see [17]) with probability as close to one as we like when we are dealing with a simple random walk on a five-color scenery, provided we have enough observations. So by taking $\varepsilon > 0$ small enough we can ensure that the random walk follows a simple random walk
path long enough to reconstruct the finite piece £l[-n ,n ] with probability larger than ~. 
So, we start by reconstructing the finite piece $\xi|_{[-n_0, n_0]}$. Then we proceed inductively in $n$ for $n \ge n_0$. Once we have reconstructed $\xi|_{[-n,n]}$ we estimate $\xi(n + 1)$ and $\xi(-n - 1)$ using the algorithm described in Section 3.2. In this way we end up reconstructing $\xi$ correctly with probability at least 3/5, which is strictly larger than 1/2. This establishes that we can reconstruct $\xi$ with probability one and thus completes the proof of Theorem 1.1.

One more remark: since in general one cannot reconstruct the scenery but only reconstruct it up to equivalence, it follows that we cannot reconstruct $\xi|_{[-n,n]}$ exactly. Instead, from the result of Matzinger [17], we have that we can only perform reconstruction successfully with high probability on an approximately centered interval $\xi|_{[-n+l,\, n+l]}$ where $l$ is small compared to $n$, but not known to us. Note however that our one-point reconstruction algorithm works if we are given the string $\xi(-n + l)\,\xi(-n + l + 1) \ldots \xi(n + l)$ without knowing its exact position (i.e. not knowing $l$), but only that $l$ is of order smaller than $n$.